Tests of Different Regularization Terms in Small Networks

Authors

  • José Luis Crespo
  • Eduardo Mora
Abstract

Several regularization terms, some of them widely applied to neural networks, such as weight decay and weight elimination, and some others new, are tested on networks with a small number of connections that handle continuous variables. Such networks arise when using additive algorithms that work by adding processors. First, the different methods and their rationale are presented. Then results are shown, first for curve-fitting problems. Since the constructive algorithm is being used for system modeling, results are also shown for a toy problem that includes recurrence buildup, in order to test the influence of the regularization terms on this process. The results show that these terms can help detect unnecessary connections. No clear winner was found among the presented terms in these tests.

Background

An automatic network construction method, described elsewhere [Crespo, 1992], has been developed for system modeling purposes. The method includes overfitting control as a way of limiting the network size, and builds a hybrid network with feedback that is easy to interpret in a useful way. It works by adding processors and fully connecting each one before trying the new network. This may add unnecessary connections, so regularization techniques have been tried to overcome the problem. The idea is to control the weight values in order to avoid considering variables that may be irrelevant for a particular task, or for a particular processor; this control would suppress unneeded dependencies added during the network setup process. One of the possibilities of the algorithm is adding feedback, and eventually it should select the right lagged terms to be considered in both the independent variables and the output. The influence of the regularization terms on this selection process has also been investigated.

Options

In order to test their efficiency, several methods are presented below. Each one adds a particular term to the objective function being minimized. The most usual methods and the corresponding terms are:

Weight decay [Hinton, 1986]:

$\lambda \sum_i w_i^2$

where the $w_i$ are the weights.

Weight elimination [Weigend et al., 1991]:

$\lambda \sum_i \frac{w_i^2 / w_0^2}{1 + w_i^2 / w_0^2}$

where $w_0$ is a fixed scale parameter.
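To make the two penalty terms concrete, here is a minimal NumPy sketch. It is not the authors' code; the function names, the hyperparameter values (`lam`, `w0`), and the use of a mean-squared-error data-fit term are illustrative assumptions.

```python
import numpy as np

def weight_decay_penalty(weights, lam):
    # Weight decay [Hinton, 1986]: lam * sum_i w_i^2
    return lam * np.sum(np.square(weights))

def weight_elimination_penalty(weights, lam, w0):
    # Weight elimination [Weigend et al., 1991]:
    # lam * sum_i (w_i^2 / w0^2) / (1 + w_i^2 / w0^2)
    r = np.square(weights / w0)
    return lam * np.sum(r / (1.0 + r))

def regularized_mse(y_true, y_pred, weights, lam=1e-3, w0=1.0):
    # Hypothetical objective: a data-fit term plus one penalty term.
    fit = np.mean(np.square(y_true - y_pred))
    return fit + weight_elimination_penalty(weights, lam, w0)

w = np.array([2.0, 0.05, -1.5, 0.01])
print(weight_decay_penalty(w, 1e-3))             # dominated by the large weights
print(weight_elimination_penalty(w, 1e-3, 1.0))  # roughly lam per large weight
```

Note that the weight-elimination term behaves like weight decay for $|w_i| \ll w_0$ but saturates near $\lambda$ for $|w_i| \gg w_0$, so it penalizes the number of significant connections rather than their size; connections whose weights are driven toward zero are candidates for removal, which is the detection effect the paper tests.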
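For context, the additive procedure described under Background can be mimicked in a self-contained toy: grow fully connected tanh "processors" one at a time and keep a new one only while validation error improves. Everything below (the data, the random input weights, the least-squares output training) is an illustrative assumption, not the algorithm of [Crespo, 1992].

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: a 1-D curve-fitting problem with a train/validation split.
x = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(x[:, 0]) + 0.1 * rng.normal(size=200)
x_tr, y_tr, x_va, y_va = x[:150], y[:150], x[150:], y[150:]

def features(x, W, b):
    # Each "processor" is a tanh unit fully connected to all inputs.
    return np.tanh(x @ W + b)

def val_error(W, b, beta):
    return np.mean((y_va - features(x_va, W, b) @ beta) ** 2)

W = np.empty((1, 0))   # input weights, one column per processor
b = np.empty(0)        # biases; start with no processors
best = np.inf
for _ in range(30):
    # Add one fully connected processor (random input weights here).
    W_new = np.hstack([W, rng.normal(size=(1, 1))])
    b_new = np.append(b, rng.normal())
    # "Train" the candidate network: least-squares output weights.
    H = features(x_tr, W_new, b_new)
    beta = np.linalg.lstsq(H, y_tr, rcond=None)[0]
    err = val_error(W_new, b_new, beta)
    if err >= best:    # overfitting control: stop growing the network
        break
    W, b, best = W_new, b_new, err

print(f"kept {W.shape[1]} processors, validation MSE {best:.4f}")
```

Since each added processor is connected to every input, some of those connections may be unnecessary, which is exactly where the penalty terms above come in.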


Similar articles

Strained Virtual Movement Field

We provide a novel view of regularization for neural networks: we smooth the objective of the network w.r.t. small adversarial perturbations of the inputs. Unlike previous works, we assume the adversarial perturbations are caused by a movement field. When the magnitude of the movement field approaches 0, we call it a virtual movement field. By introducing the movement field, we cast the ...


Adding noise to the input of a model trained with a regularized objective

Regularization is a well-studied problem in the context of neural networks. It is usually used to improve generalization performance when the number of input samples is relatively small or heavily contaminated with noise. The regularization of a parametric model can be achieved in different manners, some of which are early stopping (Morgan and Bourlard, 1990), weight decay, output smoothing ...


Forecasting Gold Price using Data Mining Techniques by Considering New Factors

Forecasting the gold price is of great importance, and researchers have presented many models to do it. It seems that although different models can forecast the gold price under different conditions, new factors affecting the forecast are of significant importance and increase forecast accuracy. In this paper, different factors were studied in comparison to the p...


Extraction of crisp logical rules using constrained backpropagation networks

Two recently developed methods for extracting crisp logical rules from neural networks trained with the backpropagation algorithm are compared. Both methods impose constraints on the structure of the network by adding regularization terms to the error function. Networks with a minimal number of connections are created, leading to a small number of crisp logical rules. The two methods are compared ...


Comparison of the performances of neural networks specification, the Translog and the Fourier flexible forms when different production technologies are used

This paper investigates the performance of artificial neural network approximation and of the Translog and Fourier flexible functional forms for the cost function when different production technologies are used. Using simulated databases, the author provides a comparison in terms of the capability to reproduce input demands and in terms of the corresponding input elasticities of substitution esti...




Publication date: 1993